NSF PAR Search | NSF Public Access Repository

Can Domain Experts Rely on AI Appropriately? A Case Study on AI-Assisted Prostate Cancer MRI Diagnosis

Chen, Chacha; Liu, Han; Yang, Jiamin; Mervak, Benjamin M; Kalaycioglu, Bora; Lee, Grace; Cakmakli, Emre; Bonatti, Matteo; Pudu, Sridhar; Kahraman, Osman; et al. (June 2025, ACM)

Despite the growing interest in human-AI decision making, experimental studies with domain experts remain rare, largely due to the complexity of working with domain experts and the challenges in setting up realistic experiments. In this work, we conduct an in-depth collaboration with radiologists in prostate cancer diagnosis based on MRI images. Building on existing tools for teaching prostate cancer diagnosis, we develop an interface and conduct two experiments to study how AI assistance and performance feedback shape the decision making of domain experts. In Study 1, clinicians were asked to provide an initial diagnosis (human), then view the AI's prediction, and subsequently finalize their decision (human-AI team). In Study 2 (after a memory wash-out period), the same participants first received aggregated performance statistics from Study 1, specifically their own performance, the AI's performance, and their human-AI team performance, and then directly viewed the AI's prediction before making their diagnosis (i.e., no independent initial diagnosis). These two workflows represent realistic ways that clinical AI tools might be used in practice, where the second study simulates a scenario in which doctors can adjust their reliance on and trust in AI based on prior performance feedback. Our findings show that, while human-AI teams consistently outperform humans alone, they still underperform the AI due to under-reliance, similar to prior studies with crowdworkers. Providing clinicians with performance feedback did not significantly improve the performance of human-AI teams, although showing AI decisions in advance nudges people to follow AI more. Meanwhile, we observe that the ensemble of human-AI teams can outperform AI alone, suggesting promising directions for human-AI collaboration.
Free, publicly-accessible full text available June 23, 2026
Pragmatic Radiology Report Generation

Nguyen, Dang; Chen, Chacha; He, He; Tan, Chenhao (December 2023, Proceedings of the 3rd Machine Learning for Health Symposium)

When pneumonia is not found on a chest X-ray, should the report describe this negative observation or omit it? We argue that this question cannot be answered from the X-ray alone and requires a pragmatic perspective, which captures the communicative goal that radiology reports serve between radiologists and patients. However, the standard image-to-text formulation for radiology report generation fails to incorporate such pragmatic intents. Following this pragmatic perspective, we demonstrate that the indication, which describes why a patient comes for an X-ray, drives the mentions of negative observations. We thus introduce indications as additional input to report generation. With respect to the output, we develop a framework to identify uninferable information from the image, which could be a source of model hallucinations, and limit them by cleaning groundtruth reports. Finally, we use indications and cleaned groundtruth reports to develop pragmatic models, and show that they outperform existing methods not only in new pragmatics-inspired metrics (e.g., +4.3 Negative F1) but also in standard metrics (e.g., +6.3 Positive F1 and +11.0 BLEU-2).
Full Text Available
Selective Explanations: Leveraging Human Input to Align Explainable AI

https://doi.org/10.1145/3610206

Lai, Vivian; Zhang, Yiming; Chen, Chacha; Liao, Q. Vera; Tan, Chenhao (September 2023, Proceedings of the ACM on Human-Computer Interaction)

While a vast collection of explainable AI (XAI) algorithms has been developed in recent years, they have been criticized for significant gaps with how humans produce and consume explanations. As a result, current XAI techniques are often found to be hard to use and lack effectiveness. In this work, we attempt to close these gaps by making AI explanations selective (a fundamental property of human explanations), selectively presenting a subset of model reasoning based on what aligns with the recipient's preferences. We propose a general framework for generating selective explanations by leveraging human input on a small dataset. This framework opens up a rich design space that accounts for different selectivity goals, types of input, and more. As a showcase, we use a decision-support task to explore selective explanations based on what the decision-maker would consider relevant to the decision task. We conducted two experimental studies to examine three paradigms based on our proposed framework: in Study 1, we ask the participants to provide critique-based or open-ended input to generate selective explanations (self-input). In Study 2, we show the participants selective explanations based on input from a panel of similar users (annotator input). Our experiments demonstrate the promise of selective explanations in reducing over-reliance on AI and improving collaborative decision making and subjective perceptions of the AI system, but also paint a nuanced picture that attributes some of these positive effects to the opportunity to provide one's own input to augment AI explanations. Overall, our work proposes a novel XAI framework inspired by human communication behaviors and demonstrates its potential to encourage future work to make AI explanations more human-compatible.
Full Text Available
Machine Explanations and Human Understanding

https://doi.org/10.1145/3593013.3593970

Chen, Chacha; Feng, Shi; Sharma, Amit; Tan, Chenhao (June 2023, ACM)
Towards a Science of Human-AI Decision Making: An Overview of Design Space in Empirical Human-Subject Studies

https://doi.org/10.1145/3593013.3594087

Lai, Vivian; Chen, Chacha; Smith-Renner, Alison; Liao, Q. Vera; Tan, Chenhao (June 2023, ACM)
Learning Human-Compatible Representations for Case-Based Decision Support

Liu, Han; Tian, Yizhou; Chen, Chacha; Feng, Shi; Chen, Yuxin; Tan, Chenhao (May 2023, The Eleventh International Conference on Learning Representations)

Algorithmic case-based decision support provides examples to help humans make sense of predicted labels and aid humans in decision-making tasks. Despite the promising performance of supervised learning, representations learned by supervised models may not align well with human intuitions: what models consider as similar examples can be perceived as distinct by humans. As a result, they have limited effectiveness in case-based decision support. In this work, we incorporate ideas from metric learning with supervised learning to examine the importance of alignment for effective decision support. In addition to instance-level labels, we use human-provided triplet judgments to learn human-compatible decision-focused representations. Using both synthetic data and human subject experiments in multiple classification tasks, we demonstrate that such representations are better aligned with human perception than representations solely optimized for classification. Human-compatible representations identify nearest neighbors that are perceived as more similar by humans and allow humans to make more accurate predictions, leading to substantial improvements in human decision accuracy (17.8% in butterfly vs. moth classification and 13.2% in pneumonia classification).
Using a neural network – Physics-based hybrid model to predict soil reaction fronts

https://doi.org/10.1016/j.cageo.2022.105200

Wen, Tao; Chen, Chacha; Zheng, Guanjie; Bandstra, Joel; Brantley, Susan L. (August 2022, Computers & Geosciences)

Full Text Available
